ABSTRACT. Software components turn out to be a convenient model for building complex
scientific computing applications and running them on a computational grid. However,
deploying complex, component-based applications in a grid environment is particularly
arduous. To spare the user from dealing directly with the large number of execution hosts
in a grid and with their heterogeneity, the application deployment phase must be as
automatic as possible. This paper describes an architecture for automatic deployment of
component-based applications on computational grids. In the context of the CORBA Component
Model (CCM), it details all the steps needed to achieve automatic deployment of components
as well as the entities involved: a grid access middleware and its grid information service
(like OGSI), a component deployment model as specified by CCM, an enriched application
description, and a deployment planner that selects resources and maps components onto
computers.

RÉSUMÉ. Software components are a well-suited solution for building complex scientific
computing applications intended to run on a computational grid. However, deploying complex
component-based applications on a grid is a particularly arduous task. To avoid having to
deal directly with the large number of computers in the grid and with their heterogeneity,
the application deployment phase must be automated. This article describes an architecture
for the automatic deployment of component-based applications on computational grids.
Starting from the CORBA Component Model (CCM), this paper details the steps of component
deployment and the actors involved: a grid resource access middleware (such as OGSI), a
component deployment model, an extended application description, and a deployment planner.

1. Introduction

Modern software development approaches are often suspected of not delivering the level of
performance that high-end parallel computers can offer. However, new scientific applications
are becoming more and more complex, involving several simulation codes coupled together to
obtain more accurate simulations. For example, multi-physics simulations aim to simulate
various physics, each of them implemented by a dedicated code, to increase the accuracy of
simulation. It is becoming clear that a radical shift in software development must occur to
handle the increasing complexity of such applications. Moreover, the computing infrastructure
should provide the level of performance needed to run such applications within a reasonable
time frame. A computational grid is without doubt a computing infrastructure that could
deliver this level of performance. It is a set of high-performance computing resources
connected to the Internet and managed by a middleware that gives transparent access to
resources wherever they are located in the network.

Software components turn out to be a convenient model for building multi-physics
applications for scientific computing and running them on a computational grid [PeR 03]
[ARM 99]. Each simulation code can be encapsulated into a component. Such an approach raises
several difficult problems, such as the encapsulation of parallel simulation codes into
software components and efficient communication between components in the presence of
various high-performance networking technologies. We have already proposed solutions
[PeR 03] [DEN 03] to those problems in the context of the CORBA Component Model (CCM)
[Obj 02]. For component-based applications running on a grid to gain wider acceptance, the
deployment phase should be as automatic as possible while taking into account application
constraints (memory, etc.) and/or user constraints. While environments like ProActive
[BAU 02] are able to deal with grid middleware, they do not support application and/or user
constraints: the mapping of virtual nodes to physical nodes has to be provided manually, and
network constraints seem difficult to handle.

This paper presents an architecture for automatic deployment of component-based
applications on computational grids. Section 2 details all the necessary entities and
their relationships to achieve an automatic deployment of components. Examples of
these entities are presented with respect to the prototype we are currently developing.
Before the conclusion, Section 3 gives an overview of the upcoming challenges.

2. Architecture for Automatic Deployment of Components 

The CORBA Component Model contains a deployment model that specifies how
a particular component can be installed, configured and launched on a machine. The
specifications do not deal with the problem of selecting machines and rely on a
ServerActivator daemon to actually launch component servers.

The proposed architecture describes the entities needed for an automatic
deployment as well as their relationships. These entities can be grouped into three
parts, each of them corresponding to a phase of the deployment process: the
inputs (the component assembly and a grid resource description), the planner, which
selects the resources and maps each component onto a computer, and the actual
deployment of the components on the selected resources. This section reviews these entities
and mentions a few issues which have already been tackled within our prototype.



2.1. Information Description 

Two pieces of information are required for automatic deployment: a description of 
the component-based application to deploy and a description of the grid resources on 
which the application may be deployed. 

2.1.1. Component-Based Application Description 

Within the context of the CORBA Component Model (CCM [Obj 02]), an application
is a set of components delivered as a component assembly package. This package is an
archive provided by the user to the deployment tool. It includes, among other files, the
assembly description, which describes all the components of the assembly and their
interconnections.

The assembly and component descriptors can express various requirements, such
as the processor architecture and the operating system required by a component
implementation. A component may have environmental or other dependencies, like
libraries, executables, Java classes, etc. Another possible requirement is component
collocation: components may be free or partitioned to a single process or a single
host, meaning that a group of component instances will have to be deployed in the
same process or on the same compute node.
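
As an illustration, the requirements listed above can be captured in a small in-memory
model. The Python sketch below is not the CCM descriptor format and is not taken from our
prototype: all class and field names are hypothetical, and the consistency check (components
collocated in one process or on one host must require the same platform) is only one example
of what such a model makes easy to verify.

```python
from dataclasses import dataclass, field
from enum import Enum


class Collocation(Enum):
    FREE = "free"        # no placement constraint
    PROCESS = "process"  # all members must share one process
    HOST = "host"        # all members must share one compute node


@dataclass
class Component:
    name: str
    arch: str            # required processor architecture
    os: str              # required operating system
    dependencies: list = field(default_factory=list)  # libraries, executables, ...


@dataclass
class CollocationGroup:
    kind: Collocation
    members: list        # names of the grouped component instances


@dataclass
class Assembly:
    components: dict     # component name -> Component
    connections: list    # (provider name, user name) pairs
    groups: list = field(default_factory=list)

    def validate(self):
        """Return a list of human-readable problems (empty if consistent)."""
        problems = []
        for pair in self.connections:
            for end in pair:
                if end not in self.components:
                    problems.append(f"connection uses unknown component {end!r}")
        for group in self.groups:
            known = [m for m in group.members if m in self.components]
            platforms = {(self.components[m].arch, self.components[m].os)
                         for m in known}
            # Components sharing a process or a host need one common platform.
            if group.kind is not Collocation.FREE and len(platforms) > 1:
                problems.append(f"group {group.members} mixes platforms")
        return problems
```

A planner could call `validate()` before any resource selection, so that an inconsistent
assembly is rejected early rather than at launch time.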

2.1.2. Grid Resources 

Information about grid resources includes not only compute and storage resource
information, but also a network description. While compute and storage resource
description is rather well mastered (computer architecture, number and speed of CPUs,
operating system, memory size, storage capacity, etc.), network description has received
less attention. We have proposed [LAC 04b] a scalable model for grid network topology
description and have implemented it on top of MDS2 [CZA 01], the information
service of version 2 of the Globus Toolkit [Glo].

The deployment tool requires a pointer to a resource information service to be
able to automatically find adequate resources. Depending on the type of resource
information service, the pointer can be a path to a local file, a URL, or a distinguished
name (DN), host and port to access an LDAP tree. Our prototype [LAC 04b] currently
supports local file access, HTTP(S) and (GSI)FTP protocols as well as LDAP / MDS2
queries.
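
To make the notion of pointer concrete, the sketch below classifies a pointer string into
one of the access methods listed above. It is a hypothetical illustration, not the
prototype's code: it assumes that the LDAP host, port and DN are written as a single
`ldap://host:port/DN` URL, and the `ResourcePointer` and `parse_pointer` names are
invented for the example. It only parses the string; it does not contact any service.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote, urlparse


@dataclass
class ResourcePointer:
    kind: str                      # "file", "http", "ftp" or "ldap"
    location: str                  # local path, or the URL as given
    host: Optional[str] = None     # LDAP only
    port: Optional[int] = None     # LDAP only
    base_dn: Optional[str] = None  # LDAP only


def parse_pointer(pointer: str) -> ResourcePointer:
    """Classify a resource-information pointer (assumed URL-like syntax)."""
    parsed = urlparse(pointer)
    scheme = parsed.scheme
    if scheme in ("", "file"):
        # Plain paths and file:// URLs both designate a local file.
        return ResourcePointer("file", parsed.path)
    if scheme in ("http", "https"):
        return ResourcePointer("http", pointer)
    if scheme in ("ftp", "gsiftp"):
        return ResourcePointer("ftp", pointer)
    if scheme == "ldap":
        # Assumed convention: the URL path carries the distinguished name.
        return ResourcePointer("ldap", pointer, parsed.hostname, parsed.port,
                               unquote(parsed.path.lstrip("/")))
    raise ValueError(f"unsupported resource pointer: {pointer!r}")
```

For example, `parse_pointer("ldap://mds.example.org:2135/Mds-Vo-name=local,o=grid")` yields
an LDAP pointer whose host, port and base DN are filled in; the host name, port and DN here
are placeholder values.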

2.2. Deployment Planning

The deployment planner is responsible for 1) selecting the computers which will
run the components and the component servers, 2) selecting the network links (or
network technology) to interconnect the components, and 3) mapping the component
servers onto the selected computers. The input of the deployment planning algorithm
is made of the application description and the resource description, as explained in
Subsection 2.1.

The output of the deployment planner is a deployment plan that describes the mapping
of the components onto component servers and the mapping of these component
servers onto the computers of the grid. The deployment plan should also specify 1) in
what order processes must be launched by the deployment tool, 2) how data must
flow from the output of certain processes to the input of other processes, and 3) what
network connections must be established between every pair of processes. For instance,
items 1) and 2) are necessary for CORBA applications, where a Naming Service needs
to be launched, and its reference needs to be passed to the other processes.
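
Items 1) and 2) are related: if the plan records which processes consume the output of
which other processes, a valid launch order follows from a topological sort. The sketch
below illustrates this idea with Python's standard `graphlib` module. The `ProcessEntry`
structure and the process and host names are assumptions made for the example; they do not
reflect the plan format of our prototype.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class ProcessEntry:
    host: str                                       # computer chosen by the planner
    components: list = field(default_factory=list)  # components hosted by the process
    needs_output_of: list = field(default_factory=list)  # producers to start first


def launch_order(plan):
    """Order process names so that every producer precedes its consumers.

    `plan` maps a process name to its ProcessEntry. A cyclic dependency
    makes graphlib raise CycleError, since no launch order exists then.
    """
    graph = {name: set(entry.needs_output_of) for name, entry in plan.items()}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical plan: both component servers need the Naming Service reference.
example_plan = {
    "server_1": ProcessEntry("node-a", ["solver"], ["naming_service"]),
    "server_2": ProcessEntry("node-b", ["mesher"], ["naming_service"]),
    "naming_service": ProcessEntry("node-a"),
}
```

With `example_plan`, `launch_order` places `naming_service` first; the relative order of
the two component servers is left unspecified because neither depends on the other.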

Our prototype is currently based on a simple round-robin deployment planning 
algorithm. It is just a proof of concept. 
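
For reference, a round-robin mapping can be written in a few lines. The sketch below is a
generic illustration of the technique, not the code of our prototype; the function name and
the component and host names are hypothetical. It shows why round-robin is only a proof of
concept: the mapping ignores every application and resource constraint.

```python
from itertools import cycle


def round_robin_plan(components, hosts):
    """Assign components to hosts in turn, wrapping around the host list."""
    if not hosts:
        raise ValueError("at least one host is required")
    # cycle() repeats the host list for as long as components remain.
    return dict(zip(components, cycle(hosts)))


mapping = round_robin_plan(["solver", "mesher", "viewer"], ["node-a", "node-b"])
# mapping == {"solver": "node-a", "mesher": "node-b", "viewer": "node-a"}
```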

2.3. Actually Launching Components on a Grid 

Once the deployment plan has been obtained from the previous step, the
component-based application is launched and configured according to the CORBA Component
Model. The technical point is that the selected machines are assumed not to contain
any component activator or component server. That is why a job submission method
is needed to launch this very first process. This step is fully compatible with the
CCM deployment model [LAC 04a] but needs more work to comply with the MDA
deployment specification [MDA]. The deployment tool manages two sorts of handles:
CORBA references and handles returned by the grid access middleware. Both are useful
to control application processes, for example to cancel, suspend, or restart their execution.

To face the diversity of grids, the deployment tool should support various grid
access middleware such as the Globus Toolkit [CZA 98], OGSA, Condor [FRE 01],
etc. Our prototype illustrates how CORBA components can be deployed on a
computational grid using the Globus Toolkit [Glo]; more details are provided in [LAC 04a].

3. Efficient Automatic Deployment 

While a few issues have already been addressed, as mentioned in Section 2, a number
of issues remain before an efficient automatic deployment can be achieved. They are
mainly related to the constraints attached to an application. The central issue is to
understand what a user expects from an automatic deployment tool and what is possible,
such as predicting the behavior of a component [FUR 02].

3.1. Enriching the Application Description

The constraints attached to a component assembly package mainly focus on
enabling the execution of the components. New kinds of constraints could be useful, like
communication requirements (latency, bandwidth, etc.) or a description of the behavior
of the application with respect to specific resources [FUR 02]. CCM specifications
allow new constraints to be added to the component assembly package, as in [WAN 03]
for example. A major issue is to standardize useful constraints.

3.2. Taking User-Level Constraints into Account 

The deployment planning algorithm (see Subsection 2.2) requires a goal [KIC 04]
to produce a deployment plan. For example, do we want to minimize the execution
time or do we want the application to run at a particular site, close to a visualization
node? Those constraints are not specific to the application itself; they are user-level
constraints. They belong neither to the application description nor to the grid resource
description. To take them into account, a third kind of information needs to be defined.
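
One possible shape for this third kind of information is a small goal record handed to the
planner next to the two descriptions. The sketch below is only a thought experiment under
that assumption: the `UserGoal` fields, the `relative_speed` attribute and the objective
name `"min_time"` are hypothetical, and ranking hosts by speed is a crude stand-in for a
real execution-time model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    site: str
    relative_speed: float  # assumed figure taken from the resource description


@dataclass
class UserGoal:
    objective: str = "min_time"          # what the user wants to optimize
    required_site: Optional[str] = None  # e.g. the site of a visualization node


def candidate_hosts(hosts, goal):
    """Filter hosts by the user's site constraint, then rank them for the goal."""
    allowed = [h for h in hosts
               if goal.required_site is None or h.site == goal.required_site]
    if goal.objective == "min_time":
        # Fastest hosts first, as a rough proxy for a short execution time.
        allowed.sort(key=lambda h: h.relative_speed, reverse=True)
    return allowed
```

The same host list then produces different candidates depending on the goal alone, which is
the reason for keeping the goal separate from the application and resource descriptions.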

3.3. Deployment Planning Algorithm 

The deployment planning algorithm of our prototype is too simple to satisfy the
constraints mentioned above. More sophisticated algorithms like Sekitei [KIC 04]
exist, but it remains to be determined whether they are suitable for our purpose. To
support a variety of application-level and user-level constraints, the planner needs to be
very customizable. Do we need a general-purpose deployment planning algorithm?
Or do we need a collection of specialized algorithms? The latter solution may give
better results, since we can imagine that an application may provide its own fine-tuned
deployment algorithm.
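
If a collection of specialized algorithms is chosen, the planner can be organized as a
registry of interchangeable planning functions, to which an application may add its own.
The sketch below illustrates this design option only; the registry, the decorator and the
two toy planners are hypothetical and do not describe an existing interface.

```python
from itertools import cycle

# Registry of planning functions, each mapping (components, hosts) to a
# component -> host dictionary. All names here are illustrative.
PLANNERS = {}


def register(name):
    """Decorator that records a planning function under the given name."""
    def decorator(func):
        PLANNERS[name] = func
        return func
    return decorator


@register("round_robin")
def round_robin(components, hosts):
    return dict(zip(components, cycle(hosts)))


@register("single_host")
def single_host(components, hosts):
    # Toy "specialized" planner: keep tightly coupled components together.
    return {component: hosts[0] for component in components}


def make_plan(components, hosts, algorithm="round_robin"):
    """Dispatch to the selected planner, which may be application-provided."""
    if not hosts:
        raise ValueError("at least one host is required")
    if algorithm not in PLANNERS:
        raise ValueError(f"unknown planning algorithm: {algorithm!r}")
    return PLANNERS[algorithm](components, hosts)
```

An application shipping a fine-tuned algorithm would register it under its own name and
request it through the `algorithm` argument.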

4. Conclusion 

On the one hand, software component technologies appear to be a convenient
model to handle the complexity of multi-physics simulations. On the other hand,
grids promise to offer the necessary level of performance for such applications. This
paper has presented an architecture to achieve automatic deployment of
component-based applications in a grid environment. The central entity is the deployment
planner, which has to select resources and map components onto them to achieve a goal. The
planner requires the description of both the component assembly and the grid resources.
It generates a deployment plan which controls the CCM deployment with the help of
a job submission method. Remaining issues in our ongoing work include the definition
of useful application-level constraints, the management of user-level constraints, and
the integration of efficient deployment planning algorithms.